Parameter Learning for Log-supermodular Distributions
Shpakova, Tatiana, Bach, Francis
We consider log-supermodular models on binary variables, which are probabilistic models whose negative log-densities are submodular. These models provide probabilistic interpretations of common combinatorial optimization tasks such as image segmentation. In this paper, we focus primarily on parameter estimation in these models from known upper bounds on the intractable log-partition function. We show that the bound based on separable optimization on the base polytope of the submodular function is always inferior to a bound based on "perturb-and-MAP" ideas. Then, to learn parameters, given that our approximation of the log-partition function is an expectation (over our own randomization), we use a stochastic subgradient technique to maximize a lower bound on the log-likelihood. This can also be extended to conditional maximum likelihood. We illustrate our new results on a set of experiments in binary image denoising, where we highlight the flexibility of a probabilistic model to learn with missing data.
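To make the perturb-and-MAP bound concrete, here is a minimal, self-contained Python sketch (not the authors' code) of one standard construction: with i.i.d. logistic perturbations z, the expectation E_z[max_x z·x − f(x)] upper-bounds the log-partition function A(f) = log Σ_x exp(−f(x)). The brute-force MAP solver and the toy cut-plus-modular energy below are illustrative assumptions; in practice the inner maximization is a submodular minimization (e.g., solvable by graph cuts for pairwise energies).

```python
# Minimal sketch: Monte Carlo estimate of the perturb-and-MAP upper bound
# on A(f) for a log-supermodular model p(x) ∝ exp(-f(x)), x in {0,1}^n.
# Illustration only: the MAP step is brute force, so keep n small.

import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 8
w = rng.uniform(0.2, 1.0, size=(n, n))  # hypothetical pairwise weights (>= 0)
u = rng.normal(size=n)                  # hypothetical unary potentials

def f(x):
    """Submodular toy energy: modular term plus a cut term."""
    cut = sum(w[i, j] for i in range(n) for j in range(n) if x[i] != x[j])
    return u @ x + cut

def brute_force_map(z):
    """max_x z.x - f(x) over x in {0,1}^n (MAP of the perturbed model)."""
    return max(z @ np.array(b) - f(np.array(b))
               for b in itertools.product([0, 1], repeat=n))

def exact_log_partition():
    """log sum_x exp(-f(x)); tractable here only because n is tiny."""
    vals = [-f(np.array(b)) for b in itertools.product([0, 1], repeat=n)]
    return np.logaddexp.reduce(vals)

def pam_upper_bound(num_samples=200):
    """Monte Carlo estimate of E_z[max_x z.x - f(x)], z_i i.i.d. Logistic(0,1);
    this expectation upper-bounds A(f)."""
    return float(np.mean([brute_force_map(rng.logistic(size=n))
                          for _ in range(num_samples)]))

print("exact log-partition:   ", exact_log_partition())
print("perturb-and-MAP bound: ", pam_upper_bound())
```

Note for learning: when f is parameterized, each sampled z contributes a subgradient of the bound through its perturbed MAP configuration, which is what makes the stochastic subgradient scheme described in the abstract applicable.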
Reviews: From MAP to Marginals: Variational Inference in Bayesian Submodular Models
The authors present L-FIELD, a variational approach to general log-submodular and log-supermodular distributions. Theoretical contributions include upper and lower bounds on the log-partition function and fully factorized approximate posteriors. The quality of the approximation is characterized with respect to the curvature of the function. Empirical results are presented on graph cuts and MRFs, decomposable functions, and facility location models.
Reviews: Parameter Learning for Log-supermodular Distributions
Technical quality: The math seems correct, though showing a few more intermediate steps in the derivations would make the reader's work easier (proofs can be pushed to the supplement to make room). In the proof of Proposition 1, for instance, it would be nice to have more detail on how the final equality is obtained. The experiments compare against an SVM baseline, but not against the parameter learning done in [9]. Specifically, [9] does parameter learning for the spin glass model and shows that it achieves an error of 1.8% compared to an SVM's 8.2%. What is different about the learning done in this work (besides the use of a different probabilistic model)?
From MAP to Marginals: Variational Inference in Bayesian Submodular Models
Djolonga, Josip, Krause, Andreas
Submodular optimization has found many applications in machine learning and beyond. We carry out the first systematic investigation of inference in probabilistic models defined through submodular functions, generalizing regular pairwise MRFs and Determinantal Point Processes. In particular, we present L-Field, a variational approach to general log-submodular and log-supermodular distributions based on sub- and supergradients. We obtain both lower and upper bounds on the log-partition function, which enables us to compute probability intervals for marginals, conditionals and marginal likelihoods. We also obtain fully factorized approximate posteriors, at the same computational cost as ordinary submodular optimization. Our framework results in convex problems for optimizing over differentials of submodular functions, which we show how to optimally solve. We provide theoretical guarantees of the approximation quality with respect to the curvature of the function. We further establish natural relations between our variational approach and the classical mean-field method. Lastly, we empirically demonstrate the accuracy of our inference scheme on several submodular models.
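As a complement, here is a minimal Python sketch (an illustration under stated assumptions, not the L-Field implementation) of the modular bound underlying this approach: any point s in the base polytope of a normalized submodular f satisfies s(A) ≤ f(A) for all sets A, so the fully factorized distribution induced by s gives A(f) ≤ Σ_i log(1 + exp(−s_i)). The sketch evaluates this bound at greedy (Edmonds) vertices of the base polytope for a hypothetical concave-of-cardinality-plus-modular function; the actual method optimizes over the entire polytope, a convex problem, rather than sampling orderings.

```python
# Minimal sketch: modular upper bounds on A(f) = log sum_A exp(-f(A))
# from greedy vertices of the base polytope of a submodular function.

import itertools
import numpy as np

rng = np.random.default_rng(1)
n = 10
V = list(range(n))
u = rng.normal(size=n)  # hypothetical unary (modular) costs

def f(A):
    """Normalized submodular toy function: concave-of-cardinality + modular."""
    return 2.0 * np.sqrt(len(A)) + sum(u[i] for i in A)

def greedy_vertex(order):
    """Edmonds' greedy vertex of the base polytope B(f) for an ordering of V.
    Its modular function s(A) = sum_{i in A} s_i lower-bounds f."""
    s = np.zeros(n)
    S, prev = set(), 0.0
    for i in order:
        S.add(i)
        cur = f(S)
        s[i] = cur - prev
        prev = cur
    return s

def modular_bound(s):
    """A(s) = sum_i log(1 + exp(-s_i)); upper-bounds A(f) when s is in B(f)."""
    return float(np.sum(np.logaddexp(0.0, -s)))

def exact_log_partition():
    vals = [-f(set(A)) for r in range(n + 1)
            for A in itertools.combinations(V, r)]
    return np.logaddexp.reduce(vals)

# Best bound over a few random greedy vertices (L-Field instead solves a
# convex separable optimization over the whole base polytope).
bounds = [modular_bound(greedy_vertex(rng.permutation(V))) for _ in range(20)]
print("exact A(f):        ", exact_log_partition())
print("best modular bound:", min(bounds))
```

The abstract above of Shpakova and Bach states that this base-polytope bound is always inferior to the perturb-and-MAP bound, which motivates the learning scheme of that paper.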